Journal article

Compact inverted index storage using general-purpose compression libraries

M Petri, A Moffat

Software: Practice and Experience | Wiley | Published: 2018

Abstract

Efficient storage of large inverted indexes is one of the key technologies that support current web search services. Here we re-examine mechanisms for representing document-level inverted indexes and within-document term frequencies, including comparing specialized methods developed for this task against recent fast implementations of general-purpose adaptive compression techniques. Experiments with the Gov2-URL collection and a large collection of crawled news stories show that standard compression libraries can provide compression effectiveness as good as or better than previous methods, with decoding rates only moderately slower than reference implementations of those tailored approaches...
